Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling

Authors

  • Zeyuan Allen-Zhu
  • Zheng Qu
  • Peter Richtárik
  • Yang Yuan
Abstract

Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems. Up to a primal-dual transformation, it is also equivalent to accelerated stochastic gradient descent, which is one of the central methods used in machine learning. In this paper, we improve the best known running time of accelerated coordinate descent by a factor of up to √n. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation-sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.
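The sampling rule at the heart of this result is simple to state in code. The sketch below is our illustration, not the authors' implementation: it draws coordinate i with probability proportional to √L_i, where L_i is the i-th coordinate-wise smoothness parameter, and takes plain coordinate-gradient steps. The full method additionally uses acceleration (momentum), which is omitted here for brevity; all names (grad_i, L, the least-squares example) are illustrative.

import numpy as np

def cd_sqrt_smoothness_sampling(grad_i, L, x0, iterations=1000, seed=0):
    # Coordinate descent with the paper's sampling rule: pick coordinate i
    # with probability p_i proportional to sqrt(L_i). The acceleration
    # (momentum) steps of the full method are omitted in this sketch.
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    p = np.sqrt(L) / np.sqrt(L).sum()      # p_i ∝ sqrt(L_i)
    for _ in range(iterations):
        i = rng.choice(len(L), p=p)        # non-uniform coordinate choice
        x[i] -= grad_i(x, i) / L[i]        # step of size 1/L_i along e_i
    return x

# Illustrative use: least squares f(x) = 0.5*||Ax - b||^2,
# whose coordinate-wise smoothness constants are L_i = ||A[:, i]||^2.
rng = np.random.default_rng(42)
A, b = rng.standard_normal((100, 10)), rng.standard_normal(100)
L = (A ** 2).sum(axis=0)
x_hat = cd_sqrt_smoothness_sampling(lambda x, i: A[:, i] @ (A @ x - b),
                                    L, np.zeros(10), iterations=5000)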


Similar Articles

Parallelizable algorithms for X-ray CT image reconstruction with spatially non-uniform updates

Statistical image reconstruction methods for X-ray CT provide good images even for reduced dose levels but require substantial compute time. Iterative algorithms that converge in fewer iterations are preferable. Spatially non-homogeneous iterative coordinate descent (NH-ICD) accelerates convergence by updating more frequently the voxels that are predicted to change the most between the current ...


Faster Optimization through Adaptive Importance Sampling

The current state-of-the-art stochastic optimization algorithms (SGD, SVRG, SCD, SDCA, etc.) are based on sampling one active data point uniformly at random in each iteration. Changing these probabilities to better reflect the importance of each data point is a natural and powerful idea. In this thesis we analyze Stochastic Coordinate Descent methods with fixed non-uniform and adaptive sampling. ...


Leverage Score Sampling for Faster Accelerated Regression and ERM

Given a matrix A ∈ ℝ^(n×d) and a vector b ∈ ℝ^n, we show how to compute an ε-approximate solution to the regression problem min_{x∈ℝ^d} ½‖Ax − b‖₂² in time Õ((n + √(d·κ_sum)) · s · log(1/ε)), where κ_sum = tr(A⊤A)/λ_min(A⊤A) and s is the maximum number of non-zero entries in a row of A. Our algorithm improves upon the previous best running time of Õ((n + √(n·κ_sum)) · s · log(1/ε)). We achieve our result through a ca...
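To make the quantity κ_sum in this bound concrete, here is a small sketch (our illustration, not from the paper) that computes κ_sum = tr(A⊤A)/λ_min(A⊤A) for a dense random matrix, assuming A⊤A is positive definite:

import numpy as np

# Compute kappa_sum = tr(A^T A) / lambda_min(A^T A) for a dense matrix.
# eigvalsh returns eigenvalues of the symmetric Gram matrix in ascending order.
A = np.random.default_rng(0).standard_normal((200, 20))
gram = A.T @ A
kappa_sum = np.trace(gram) / np.linalg.eigvalsh(gram)[0]
print(f"kappa_sum = {kappa_sum:.1f}")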


Smooth Primal-Dual Coordinate Descent Algorithms for Nonsmooth Convex Optimization

We propose a new randomized coordinate descent method for a convex optimization template with broad applications. Our analysis relies on a novel combination of four ideas applied to the primal-dual gap function: smoothing, acceleration, homotopy, and coordinate descent with non-uniform sampling. As a result, our method features the first convergence rate guarantees among the coordinate descent ...


LSH-Sampling Breaks the Computational Chicken-and-Egg Loop in Adaptive Stochastic Gradient Estimation

Stochastic Gradient Descent, or SGD, is the most popular optimization algorithm for large-scale problems. SGD estimates the gradient by uniform sampling with sample size one. Several other works suggest faster epoch-wise convergence by using weighted non-uniform sampling for better gradient estimates. Unfortunately, the per-iteration cost of maintaining this adaptive distribu...




Publication date: 2016